Abstract. Our recent minimal model of cooperation (P. Gawronski et
al., Physica A 388 (2009) 3581) is modified so as to allow for time-dependent
altruism. This evolution is based on the reputation of other agents, which
in turn depends on history. We show that this modification leads to two
absorbing states of the whole system, where cooperation flourishes
in one state and is absent in the other. The effect is compared with
the results obtained with the model of indirect reciprocity, where the
altruism of agents is constant.
1 Introduction

The Prisoner's Dilemma (PD) is a canonical example of a game in which mutual
cooperation is not profitable for an individual player and yet is
profitable for society. This paradoxical aspect and the wide set of situations
where PD applies make it a central point in game theory [1]. On the sociological
side, PD is considered one of three generic games which represent three
paradigmatic interaction situations; coordination and inequality are the two others
[5]. As a rule, PD is investigated within the framework of game theory, i.e. decisions
are made on the basis of calculated payoffs. However, many scholars have indicated
that this framework is too narrow [2,3,4,5]. In particular, in [6] an opposition
of two archetypal players was presented: Homo Economicus, a creature
who is rational and purely self-regarding, and Homo Sociologicus, a creature
who follows prevailing social norms; both concepts were stated to be
erroneous as one-sided. The truth, if accessible, is supposed to lie in between.
Still, any recipe for weighting these two attitudes in a given society remains
arbitrary. It seems obvious that any classification of this kind is improved when
we accept a gradual scale.



Here we are interested in a model of possible cooperation where both aspects
are captured, the normative one and the rational one. Our starting point
is the social mechanism of competitive altruism [7,8], in which individuals compete
for the most altruistic partners; in this competition, altruistic behaviour is a signal.
This mechanism was the basis of our previous minimal model of cooperation,
where players differed in altruism; the latter was defined as a willingness to
cooperate [5]. The agents' behaviour was encoded in the form of their time-dependent
reputation. A welcome result was that altruistic players were rewarded by
cooperation of other agents. In this way, the classical deficiency of the altruistic
strategy, being abused by selfish free-riders, was evaded in our model.

Simple as it was formulated, the model of cooperation [5] is indeed minimal
in the sense that its outcome is obtained with a minimal number of assumptions.
The aim of this paper is to build into the model the fact that people learn,
and that their willingness to cooperate is modified by their personal experience. The
dynamics of one's willingness to cooperate is at the core of the phenomenon of
competitive altruism. In [S], we read: "Competitive altruism is based on two simple
premises. First it assumes that there are individual differences in altruism.
(...) Second, in forming alliances there is competition for the most moral and
cooperative partners." In our previous formulation [S], only the first premise was
included directly: the players differed in the values of their altruism. However,
it is clear that in competing with others, one must modify one's own behaviour so as to
outperform the rivals. Therefore, individual altruisms should vary as well, if the
second premise is to be included in the model.

With this premise in mind, we propose here two schemes of varying the players'
altruism after each game. According to our first option, say A, altruism is
updated in the same way as the reputation, only somewhat slower. This
small difference seems realistic: we change our opinions on other people almost
immediately after getting new information about them. Also, the difference is
dictated by our aim to compare the results with those in [3], where the altruism of
each agent was constant. The second option, B, introduces a modification
suggested by the Scheff theory of shame and pride [10]. According to this theory,
mutual respect of two agents expressed by their cooperation enhances their
self-evaluation, which in turn reinforces their willingness to cooperate. On the other
hand, a cooperating agent is humiliated when she/he meets a defection, which reduces
her/his willingness to cooperate.

How do these modifications of altruism influence the ability of the population
to cooperate? In option A, the strategies of cooperation and of defection are
equivalent, so the solution should be symmetric with respect to an interchange of
these strategies. As we explain in detail in the next section, in option B two
successful cooperators have their altruisms increased, but this event happens
with a probability equal to, roughly, the squared concentration of cooperators. If
this concentration is initially 1/2, the process should be neutralized by a
decrease of altruism of an unsuccessful cooperator. Below we show that this is not
the case; the rules, apparently symmetric, promote cooperation.
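
The counting argument behind this expectation can be made explicit with a short sketch. It assumes, as a simplification that is not part of the model itself, that each of the two players cooperates independently with the same probability c (the concentration of cooperators); the function name is hypothetical.

```python
def expected_altruism_events(c):
    """Expected number of altruism increases and decreases per game in option B.

    Assumption (illustration only): each of the two players cooperates
    independently with probability c.
    """
    p_both_cooperate = c * c            # both players gain altruism
    p_one_cooperates = 2 * c * (1 - c)  # the lone cooperator loses altruism
    increases = 2 * p_both_cooperate    # two agents are updated upwards
    decreases = 1 * p_one_cooperates    # one agent is updated downwards
    return increases, decreases


print(expected_altruism_events(0.5))  # (0.5, 0.5): the counts balance at c = 1/2
```

At c = 1/2 the expected numbers of upward and downward updates coincide, which is the naive symmetry that the simulations do not confirm.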

Actually, it seems that the coupling between the willingness to cooperate
of different players is more general than the effect of competitive altruism or of
the loops of shame and pride. This coupling can be viewed as a general ability of
a set of players to establish a social norm of cooperation, and this norm in turn
allows a social group to be defined. The role of norms in establishing social groups
and societies is much too wide a subject to be discussed here [4,11]. Instead, we
refer only to the definition of a social norm [12], which clearly underlines the role
of mutual expectations of agents: once they believe they recognize the attitudes
of the others, a germ of a norm is established.

This ambiguity suggests that it is desirable to look for another mechanism,
not due to the modification of altruism, which leads to an enhancement of
cooperation. Our choice is to check the scenario where the altruism of each agent is
constant, but the modification of reputation depends on the reputation of the
co-player. In this way, the parameter controlling the change of reputation of agent
i is just the reputation of i's co-player j. This choice, marked here as option C, is
motivated by the mechanism of indirect reciprocity and punishment, discussed
e.g. in [13]. In other words, the modification of reputation in a game with an
agent with bad reputation remains small.

The outline of the paper is as follows. In the next section we explain the
original version of the model [5] and options A, B and C considered here. In the
third section numerical results are described. There we demonstrate that when
altruism is varied, all players defect or all cooperate in the stationary state. This
bistability with two homogeneous states is the main result of the paper. We
also show that for option C, cooperation is promoted. The last section is
devoted to discussion.

2 The model 

The system is equivalent to a fully connected graph of $N$ nodes, with a link
between each pair of agents at the nodes. Each agent $i$ is endowed with two
parameters: altruism $\varepsilon_i$ and reputation $W_i$. Initial values of the parameters
are selected randomly from homogeneous distributions: $p(\varepsilon_i)$ is constant for
$E_1 < \varepsilon_i < E_2$, otherwise $p(\varepsilon_i) = 0$, and $p(W_i)$ is constant for $V_1 < W_i < V_2$,
otherwise $p(W_i) = 0$.
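
As an illustration, the initial state can be drawn as in the following minimal sketch. The function name, the seed argument and the population size used in the example call are assumptions made for the illustration; only the uniform distributions and their bounds come from the text.

```python
import random


def init_population(n, e1, e2, v1, v2, seed=None):
    """Draw initial altruism and reputation from uniform distributions.

    e1, e2 bound the altruism and v1, v2 bound the reputation.
    """
    rng = random.Random(seed)
    altruism = [rng.uniform(e1, e2) for _ in range(n)]
    reputation = [rng.uniform(v1, v2) for _ in range(n)]
    return altruism, reputation


# Hypothetical example: symmetric altruism, identical initial reputation.
eps, rep = init_population(n=1000, e1=-0.4, e2=0.4, v1=0.5, v2=0.5, seed=1)
```

Setting v1 equal to v2 gives every agent the same initial reputation.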

Fig. 1. The reputation distribution $n(W)$ for an exemplary system of $N$ = 10^
agents, evolving according to option A, at the early stage after 10^ games.
Here, the initial distribution of altruism is symmetric (case ii)). After a relatively
quick development of groups of agents with good and bad reputation,
those of good reputation slowly start to dominate: the right maximum grows larger.

During the simulation, a pair of nodes $(i, j)$ is selected randomly. The
probability that $i$ cooperates with $j$ is

$$P(i,j) = F(\varepsilon_i + W_j) \tag{1}$$

where $F(x) = 0$ if $x < 0$, $F(x) = x$ if $0 < x < 1$ and $F(x) = 1$ if $x > 1$. In [9],
it was only the reputations $W_i$, $W_j$ that evolved in time. If $i$ cooperated, her/his
reputation was transformed as $W_i \to (1 + W_i)/2$, otherwise $W_i \to W_i/2$. Here
we set $W_i$ to change in the same way.
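
The decision rule and the reputation update can be sketched as follows. This is only an illustration of the rules stated above; the function names are hypothetical, and a player's move is assumed to be a single random draw compared against $P(i,j)$.

```python
import random


def clamp01(x):
    """F(x): 0 below 0, x between 0 and 1, 1 above 1."""
    return min(1.0, max(0.0, x))


def cooperation_probability(altruism_i, reputation_j):
    """P(i, j) = F(eps_i + W_j), Eq. (1)."""
    return clamp01(altruism_i + reputation_j)


def update_reputation(w, cooperated):
    """W -> (1 + W)/2 after cooperation, W -> W/2 after defection."""
    return (1.0 + w) / 2.0 if cooperated else w / 2.0


def play(altruism_i, reputation_j, rng=random):
    """Return True if agent i cooperates with agent j in one game."""
    return rng.random() < cooperation_probability(altruism_i, reputation_j)
```

With these rules the reputation always stays between 0 and 1, while the sum of altruism and reputation may leave this interval, which is why $F$ is needed.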

Fig. 2. The altruism distribution $n(\varepsilon)$ for an exemplary system of $N$ = 10'^
agents, evolving according to option B. After 10^ games, the distribution is
almost stable. Here, the initial distribution of altruism promotes defection (case
i)).

Moreover, and this is a new element, we also allow the altruism to change.
In option A, this change is governed by a similar prescription. If $i$ and $j$
play, the altruism of $j$ varies as

$$\varepsilon_j \to \varepsilon_j + (\pm 1/2 - \varepsilon_j)\,x \tag{2}$$

where $0 < x < 1$ is a parameter which measures the velocity of change of
altruism; the sign $+$ applies if $i$ cooperates and the sign $-$ if $i$ defects. In other
words, the altruism $\varepsilon_j$ increases if $j$ meets cooperation and decreases otherwise. As
long as $x < 1/2$, the time evolution of altruism is slower than that of
reputation. Here we set $x = 0.1$. Larger values of $x$ just speed up the changes of
agents' altruism.
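
A minimal sketch of the option A rule, Eq. (2), is given below; the function name is hypothetical. The update moves the altruism a fraction x of the way towards +1/2 or -1/2, so these two values are its fixed points.

```python
def update_altruism_option_a(altruism_j, partner_cooperated, x=0.1):
    """Eq. (2): eps_j -> eps_j + (+1/2 - eps_j) x or eps_j + (-1/2 - eps_j) x."""
    target = 0.5 if partner_cooperated else -0.5
    return altruism_j + (target - altruism_j) * x


# Repeatedly meeting cooperation drives the altruism towards +1/2.
eps = 0.0
for _ in range(100):
    eps = update_altruism_option_a(eps, partner_cooperated=True)
```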

In option B, only the rule of variation of $\varepsilon_i$ is changed with
respect to option A. Namely, if both $i$ and $j$ cooperate, their altruisms increase
as

$$\varepsilon_i \to \varepsilon_i + (1/2 - \varepsilon_i)\,x \tag{3}$$
$$\varepsilon_j \to \varepsilon_j + (1/2 - \varepsilon_j)\,x \tag{4}$$

If $i$ cooperates and $j$ defects, then the altruism of $i$ is reduced as

$$\varepsilon_i \to \varepsilon_i + (-1/2 - \varepsilon_i)\,x \tag{5}$$

whilst $\varepsilon_j$ is not changed. When both $i$ and $j$ defect, nothing is changed.
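
The three cases of option B can be collected in one function, as in the sketch below. The function name is hypothetical, and applying Eq. (5) to $j$ when the roles are exchanged is our reading of the rule, which is stated in the text for $i$ only.

```python
def update_altruism_option_b(eps_i, eps_j, i_cooperated, j_cooperated, x=0.1):
    """Return the new (eps_i, eps_j) after one game under option B."""
    if i_cooperated and j_cooperated:
        # Eqs. (3) and (4): mutual cooperation raises both altruisms.
        eps_i += (0.5 - eps_i) * x
        eps_j += (0.5 - eps_j) * x
    elif i_cooperated:
        # Eq. (5): a cooperator who meets defection loses altruism.
        eps_i += (-0.5 - eps_i) * x
    elif j_cooperated:
        # Assumed: the same rule with the roles of i and j exchanged.
        eps_j += (-0.5 - eps_j) * x
    # Mutual defection changes nothing.
    return eps_i, eps_j
```

Note that a defecting agent never has her/his altruism changed in this option.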

Fig. 3. Mean value of reputation after 10^ games for a system of $N$ = 10^ agents
evolving according to option C, against the parameter $z$ which controls
the speed of evolution of reputation. The three curves are for the three cases of initial
conditions, i), ii) and iii), from the bottom to the top. The statistics is collected
for 10'^ systems.

In option C, altruism remains constant, as in [S]. The velocity of the
variation of the reputation of $i$ is controlled by the reputation of her/his co-player $j$.
Namely, when $i$ cooperates, her/his reputation $W_i$ changes as

$$W_i \to W_i(1 - zW_j) + zW_j \tag{6}$$

where $z$ is a parameter. When $i$ defects,

$$W_i \to W_i(1 - zW_j). \tag{7}$$
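
Equations (6) and (7) can be sketched as a single function; the name is hypothetical. The product $zW_j$ plays the role of the update rate.

```python
def update_reputation_option_c(w_i, w_j, i_cooperated, z):
    """Eqs. (6) and (7): the rate of change of W_i is z * W_j."""
    rate = z * w_j
    if i_cooperated:
        return w_i * (1.0 - rate) + rate  # Eq. (6)
    return w_i * (1.0 - rate)             # Eq. (7)
```

For $zW_j = 1/2$ the function reproduces the original rules $W_i \to (1 + W_i)/2$ and $W_i \to W_i/2$, and for $W_j = 0$ the reputation of $i$ is not changed at all.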

3 Results 

Fig. 4. The probabilities of the four outcomes of the Prisoner's Dilemma game for the
system evolving according to option C. Here, the initial distribution of
altruism is symmetric (case ii)). The statistics is collected for an exemplary system
of $N$ = 10'^ agents after 10^ games.

For option A, the problem is symmetric with respect to an interchange of
the strategies: cooperation and defection. When the initial distributions of
both reputation and altruism are also symmetric ($V_1 + V_2 = 1$ and $E_1 + E_2 = 0$),
this symmetry should be preserved in the solution. This is so, however, only
in the statistical sense. For each simulation, the system breaks the symmetry
spontaneously and the time evolution leads to one of two homogeneous states,
where each agent adopts the same strategy. In one of these two states, all agents
cooperate; in the other, all defect. The process of spontaneous symmetry breaking
is visible in Fig. 1. There, we show the probability distributions of reputation
and altruism for $N$ = 10'^ agents at the early stage of the process, after 10^
games. For the symmetric initial state where $\varepsilon = 0$ and $V_1 = V_2 = 0.5$, the result
obtained numerically as an average over 10"^ systems, each after 10^ games, is
that the probability of the state "all cooperate" is 0.48. As a rule, this
probability is found to be close to 0.5 at the straight line $W = 1/2 - \varepsilon$. Above this
line all cooperate, below this line all defect, except in the vicinity of the line (of
width about 0.1 for $N$ = 10'^), where the probability of the state "all cooperate"
changes continuously from 0 to 1. This means that a manipulation of the initial
state $(\varepsilon, W)$ modifies the final result, which still remains homogeneous. In
particular, when the initial value of $\varepsilon$ is shifted by 0.1 downwards, all defect; upwards,
all cooperate. The properties of the boundary $W \approx 1/2 - \varepsilon$ are a consequence
of the adopted form of $P(i,j)$, but the homogeneity of stationary states comes
from the system dynamics. The results are obtained for $x = 0.1$; an increase of
$x$ just speeds the process up. For $x = 1/2$, the time-dependent mean square roots
of reputation and altruism remain equal while decreasing to zero.
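
For readers who wish to experiment, a minimal sketch of the option A dynamics follows. It is not the code used to obtain the results above: the function name, the order of the updates within one game, the population size and the number of games in the example are all assumptions made for the illustration.

```python
import random


def clamp01(x):
    """F(x) of Eq. (1)."""
    return min(1.0, max(0.0, x))


def simulate_option_a(altruism, reputation, n_games, x=0.1, seed=None):
    """Run n_games random pairwise games under option A.

    The lists altruism and reputation are updated in place and returned.
    """
    rng = random.Random(seed)
    n = len(altruism)
    for _ in range(n_games):
        i, j = rng.sample(range(n), 2)
        # Both decisions use the values held before the game, Eq. (1).
        i_coop = rng.random() < clamp01(altruism[i] + reputation[j])
        j_coop = rng.random() < clamp01(altruism[j] + reputation[i])
        # Reputation follows the player's own move.
        reputation[i] = (1 + reputation[i]) / 2 if i_coop else reputation[i] / 2
        reputation[j] = (1 + reputation[j]) / 2 if j_coop else reputation[j] / 2
        # Altruism follows the co-player's move, Eq. (2).
        altruism[i] += ((0.5 if j_coop else -0.5) - altruism[i]) * x
        altruism[j] += ((0.5 if i_coop else -0.5) - altruism[j]) * x
    return altruism, reputation


# Hypothetical small run from a symmetric initial state.
init = random.Random(0)
eps = [init.uniform(-0.4, 0.4) for _ in range(50)]
rep = [0.5] * 50
simulate_option_a(eps, rep, n_games=5000, seed=0)
print(sum(rep) / len(rep), sum(eps) / len(eps))
```

In this sketch the states with $\varepsilon_i = 1/2$, $W_i = 1$ for all agents and with $\varepsilon_i = -1/2$, $W_i = 0$ for all agents are left unchanged by the dynamics, in line with the two homogeneous states described above.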




For option B, we observe the same homogeneous states "all cooperate" or
"all defect", but the probabilities of these states are different. Again we use the
same initial reputation for all agents, and the same statistics. For each system
of $N$ = 10^ agents we made three runs, each of 100 timesteps, with different
initial conditions: i) $E_1 = -0.5$ and $E_2 = 0.3$, ii) $E_1 = -0.4$ and $E_2 = 0.4$, iii)
$E_1 = -0.3$ and $E_2 = 0.5$. While the initial conditions ii) are neutral, case i)
promotes defection and case iii) promotes cooperation. For option A, the
same initial conditions served to demonstrate the symmetry of the two strategies.
However, for option B we observe that in case ii) cooperation prevails.
The numbers obtained for i) are $W_i = 0$ for all agents, and the average $\varepsilon_i$ for
each system is $-0.19 \pm 0.15$. In case iii) all cooperate, and their altruism
is $+0.5$. In case ii) we again get two homogeneous states: "all cooperate"
with probability 0.88, "all defect" with probability 0.12. In the former state, the
altruism of all agents is maximal: 0.5. In the latter, for each system the average
$\varepsilon$ is $-0.15 \pm 0.11$. Note that in option B, the altruism of uncooperative agents
remains unchanged, hence its spread in the uncooperative phase. An exemplary
plot of the altruism in this case is shown in Fig. 2. Again, these results are
obtained for $x = 0.1$. When $x$ increases to 1, the average altruism $\varepsilon$ drops to its
minimal value $-0.5$ almost linearly with $x$.

For option C, calculations are made for the three initial conditions given above
and the same statistics. In this option, the altruism of each agent
remains unchanged, so for the initial conditions i) and iii) a permanent bias is
present in the system towards defection and cooperation, respectively. However,
the system evolution (Eqs. 6 and 7) produces another bias, always towards
cooperation. Clearly, there is no homogeneous phase here. The obtained values
of mean reputations for each of the 10'^ systems are practically the same. These
results are shown in Fig. 3, as dependent on the parameter $z$. To demonstrate the
character of the latter bias, we also show in Fig. 4 an example of the plots of the
probability of a common cooperation (R), of a common defection (U), of
cooperating but being defected (S) and of defecting against a cooperating co-player (T). Plots
of the same character were shown in [5] for the symmetric case where altruism
is constant. As we see in the plots, in option C cooperation is promoted.
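
For a single pair of agents, the probabilities of the four outcomes follow directly from Eq. (1), as in the sketch below. It assumes that the two players decide independently; the function name is hypothetical, and the population-level curves of Fig. 4 are averages that this one-pair sketch does not reproduce.

```python
def outcome_probabilities(eps_i, w_i, eps_j, w_j):
    """Outcome probabilities of one game, seen from the side of agent i."""
    def clamp01(v):
        return min(1.0, max(0.0, v))

    p_i = clamp01(eps_i + w_j)  # probability that i cooperates with j
    p_j = clamp01(eps_j + w_i)  # probability that j cooperates with i
    return {
        "R": p_i * p_j,              # common cooperation
        "S": p_i * (1 - p_j),        # i cooperates, j defects
        "T": (1 - p_i) * p_j,        # i defects, j cooperates
        "U": (1 - p_i) * (1 - p_j),  # common defection
    }
```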



4 Discussion 

The result of our simulation is that once the altruism is allowed to evolve, in
the long time limit the simulated players adopt one strategy, the same for the whole
population. This strategy is either to cooperate or to defect. For the adopted
initial distributions of $\varepsilon_i$ and $W_i$, the final outcome is basically determined by
the initial mean values $W = (V_1 + V_2)/2$ and $\varepsilon = (E_1 + E_2)/2$ as follows: once
$W + \varepsilon > 1/2$, the final strategy is to cooperate, otherwise the strategy is to
defect. This is true except for the case when $W + \varepsilon \approx 1/2$. In this case it is possible
that the whole population defects or cooperates; the respective probabilities vary
with $W + \varepsilon$. This result is new and completely different from the case $x = 0$,
considered previously [S]. It is also different from the results obtained above for the
model of indirect reciprocity, where altruism does not vary. In the latter model
the mechanism which stabilizes cooperation is that agents with small (bad)
reputation, even if defected, do not influence the reputation of co-players.
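
The criterion stated above can be written as a rule of thumb. The sketch below reads it literally; the function name is hypothetical, and the half-width of the undecided region is an assumption taken from the width of about 0.1 reported for option A.

```python
def predicted_final_state(mean_reputation, mean_altruism, half_width=0.05):
    """Rough prediction of the homogeneous final state from initial mean values.

    half_width (assumed) marks the vicinity of W + eps = 1/2 in which both
    final states are possible.
    """
    s = mean_reputation + mean_altruism
    if s > 0.5 + half_width:
        return "all cooperate"
    if s < 0.5 - half_width:
        return "all defect"
    return "either, with probabilities depending on W + eps"


print(predicted_final_state(0.5, 0.1))   # all cooperate
print(predicted_final_state(0.5, -0.1))  # all defect
```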

As remarked above, human general attitudes to cooperation are expected
to vary more slowly than opinions on particular co-players, and
it seems reasonable to believe that the former are driven by the latter. We would
like to stress that as a rule, what is observed in social phenomena is an
interplay of transient effects with different characteristic times. Then, conclusions of
modeling should be related to the direction of the process rather than to the
stationary state in the long time limit. In particular, our model takes into account
a coupling between an agent's experience of the behaviour of the others and
the overall willingness of this agent to cooperate. Our results indicate that the
feedback is positive; more cooperation bears more altruism, which in turn leads
to more cooperation. As a rule, an agent's experience that cooperation is met
in most cases leads to a general belief that to cooperate is an accepted social
norm. Then, in our theories on experimental data we should consider the
direction of the process rather than its stationary stage. One experiment of this kind
was conducted in the Swiss army [14], within the Prisoner's Dilemma scheme.
There, platoons of males were formed in a random way for a 4-week period of
officer training. Having finished the training, individuals believed that members
of their own platoons were more willing to cooperate than others. More data on
social experiments can be found in [15].

As we noted in the Introduction, there is no direct one-to-one correspondence
between social effects and theories; the same effect can be discussed within more
than one theory. It seems worthwhile to refer here again to the Scheff theory of
shame and pride [10,16]. This theory describes the self-stabilizing consequences
of social interaction in terms of a loop of pride and a loop of shame. Again, as in the
definition of a social norm by Bicchieri [12], the crucial factor of interaction is the
content of mutual expectation. On the laboratory side, we have the same
ambiguity. A recent list of reputation-based experiments was composed by Binglin
Gong and Chun-Lei Yang in [17]. In the same paper a new experiment of this
kind is described, and the interpretation provided is again twofold: either
indirect reciprocity, or some "sense of justice" of the participants. Concluding,
at the present stage of social theory we have to accept that, both in the case
of experiment and of simulation, an interpretation is not unique. It seems to us
that identifying connections between different effects and/or properties of social
systems is less ambiguous.